AITopics | apprenticeship learning

Collaborating Authors

apprenticeship learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Subject-driven Text-to-Image Generation via Apprenticeship Learning

Neural Information Processing SystemsDec-25-2025, 15:16:11 GMT

Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject, by fine-tuning an ``expert model'' for a given subject from a few examples.However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replaces subject-specific fine tuning with {in-context} learning.Given a few demonstrations of a new subject, SuTI can instantly generate novel renditions of the subject in different scenes, without any subject-specific optimization.SuTI is powered by {apprenticeship learning}, where a single apprentice model is learned from data generated by a massive number of subject-specific expert models. Specifically, we mine millions of image clusters from the Internet, each centered around a specific visual subject. We adopt these clusters to train a massive number of expert models, each specializing in a different subject. The apprentice model SuTI then learns to imitate the behavior of these fine-tuned experts. SuTI can generate high-quality and customized subject-specific images 20x faster than optimization-based SoTA methods. On the challenging DreamBench and DreamBench-v2, our human evaluation shows that SuTI significantly outperforms existing models like InstructPix2Pix, Textual Inversion, Imagic, Prompt2Prompt, Re-Imagen and DreamBooth.

expert model, name change, subject-driven text-to-image generation, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.82)
Information Technology > Artificial Intelligence > Vision (0.65)

Add feedback

PAC Apprenticeship Learning with Bayesian Active Inverse Reinforcement Learning

Bajgar, Ondrej, Gould, Dewi S. W., Liu, Jonathon, Abate, Alessandro, Gatsis, Konstantinos, Osborne, Michael A.

arXiv.org Artificial IntelligenceSep-22-2025

As AI systems become increasingly autonomous, reliably aligning their decision-making with human preferences is essential. Inverse reinforcement learning (IRL) offers a promising approach to infer preferences from demonstrations. These preferences can then be used to produce an apprentice policy that performs well on the demonstrated task. However, in domains like autonomous driving or robotics, where errors can have serious consequences, we need not just good average performance but reliable policies with formal guarantees -- yet obtaining sufficient human demonstrations for reliability guarantees can be costly. Active IRL addresses this challenge by strategically selecting the most informative scenarios for human demonstration. We introduce PAC-EIG, an information-theoretic acquisition function that directly targets probably-approximately-correct (PAC) guarantees for the learned policy -- providing the first such theoretical guarantee for active IRL with noisy expert demonstrations. Our method maximises information gain about the regret of the apprentice policy, efficiently identifying states requiring further demonstration. We also present Reward-EIG as an alternative when learning the reward itself is the primary objective. Focusing on finite state-action spaces, we prove convergence bounds, illustrate failure modes of prior heuristic methods, and demonstrate our method's advantages experimentally.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.03693

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)

Genre:

Research Report (1.00)
Workflow (0.93)

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Subject-driven Text-to-Image Generation via Apprenticeship Learning

Neural Information Processing SystemsJan-18-2025, 19:01:27 GMT

Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject, by fine-tuning an expert model'' for a given subject from a few examples.However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replaces subject-specific fine tuning with {in-context} learning.Given a few demonstrations of a new subject, SuTI can instantly generate novel renditions of the subject in different scenes, without any subject-specific optimization.SuTI is powered by {apprenticeship learning}, where a single apprentice model is learned from data generated by a massive number of subject-specific expert models. Specifically, we mine millions of image clusters from the Internet, each centered around a specific visual subject. We adopt these clusters to train a massive number of expert models, each specializing in a different subject. The apprentice model SuTI then learns to imitate the behavior of these fine-tuned experts. SuTI can generate high-quality and customized subject-specific images 20x faster than optimization-based SoTA methods.

apprenticeship learning, expert model, subject-driven text-to-image generation, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

Add feedback

Algorithms for Learning Markov Field Policies

Neural Information Processing SystemsMar-14-2024, 13:48:51 GMT

We use a graphical model for representing policies in Markov Decision Processes. This new representation can easily incorporate domain knowledge in the form of a state similarity graph that loosely indicates which states are supposed to have similar optimal actions. A bias is then introduced into the policy search process by sampling policies from a distribution that assigns high probabilities to policies that agree with the provided state similarity graph, i.e. smoother policies.

bellman error, learning, value function, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
(2 more...)

Add feedback

A Reduction from Apprenticeship Learning to Classification

Neural Information Processing SystemsFeb-16-2024, 09:52:37 GMT

We provide new theoretical results for apprenticeship learning, a variant of reinforcement learning in which the true reward function is unknown, and the goal is to perform well relative to an observed expert. We study a common approach to learning from expert demonstrations: using a classification algorithm to learn to imitate the expert's behavior. Although this straightforward learning strategy is widely-used in practice, it has been subject to very little formal analysis. We prove that, if the learned classifier has error rate \eps, the difference between the value of the apprentice's policy and the expert's policy is O(\sqrt{\eps}) . Further, we prove that this difference is only O(\eps) when the expert's policy is close to optimal.

apprenticeship learning, classification, demonstration, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

A Game-Theoretic Approach to Apprenticeship Learning

Neural Information Processing SystemsApr-6-2023, 14:51:08 GMT

We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel and Ng [1] who considered a framework in which the true reward function is assumed to be a linear combination of a set of known and observable features. We give a new algorithm that, like theirs, is guaranteed to learn a policy that is nearly as good as the expert's, given enough examples. However, unlike their algorithm, we show that ours may produce a policy that is substantially better than the expert's. Moreover, our algorithm is computationally faster, is easier to implement, and can be applied even in the absence of an expert.

algorithm, apprenticeship learning, game-theoretic approach, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Mathematics of Computing (0.45)
Information Technology > Game Theory (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

Safety-Aware Multi-Agent Apprenticeship Learning

Zhao, Junchen

arXiv.org Artificial IntelligenceJan-24-2022

As the rapid development of Artifical Intelligence in the current technology field, Reinforcement Learning has been proven as a powerful technique that allows autonomous agents to learn optimal behaviors (called policies) in unknown and complex environments through models of rewards and penalization. However, in order to make this technique (Reinforcement Learning) work correctly and get the precise reward function, which returns the feedback to the learning agent about when the agent behaves correctly or not, the reward function needs to be thoroughly specified. As a result, in real-world complex environments, such as autonomous driving, specifying a correct reward function could be one of the hard tasks to tackle for the Reinforcement Learning model designers. To this end, Apprenticeship Learning techniques, in which the agent can infer a reward function from expert behaviors, are of high interest due to the fact that they could result in highly specified reward function efficiently. However, for critical tasks such as autonomous driving, we need to critically consider about the safety-related issues, so as to we need to build techniques to automatically check and ensure that the inferred rewards functions and policies resulted from the Reinforcement Learning model fulfill the needed safety requirements of the critical tasks that we have mentioned previously. In order to have a well-designed Reinforcement Learning model, which is able to generate the highly-specified reward function satisfying the safety-related considerations, the technique called "Safety-Aware Apprenticeship Learning" was built in 2018[23], which would be introduced in detail in the later sections. Although the technique "Safety-Aware Apprenticeship Learning" has been built, it only considers Single-Agent scenario. In the other word, the current "Safety-Aware Apprenticeship Learning" technique can only be applied to one agent running in an isolated environment, a fact which limits the potential implementation of this technique.

agent, apprenticeship learning, learning, (14 more...)

arXiv.org Artificial Intelligence

2201.08111

Genre: Research Report (0.50)

Industry:

Education (0.94)
Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Online Apprenticeship Learning

Shani, Lior, Zahavy, Tom, Mannor, Shie

arXiv.org Machine LearningFeb-13-2021

In Apprenticeship Learning (AL), we are given a Markov Decision Process (MDP) without access to the cost function. Instead, we observe trajectories sampled by an expert that acts according to some policy. The goal is to find a policy that matches the expert's performance on some predefined set of cost functions. We introduce an online variant of AL (Online Apprenticeship Learning; OAL), where the agent is expected to perform comparably to the expert while interacting with the environment. We show that the OAL problem can be effectively solved by combining two mirror descent based no-regret algorithms: one for policy optimization and another for learning the worst case cost. To this end, we derive a convergent algorithm with $O(\sqrt{K})$ regret, where $K$ is the number of interactions with the MDP, and an additional linear error term that depends on the amount of expert trajectories available. Importantly, our algorithm avoids the need to solve an MDP at each iteration, making it more practical compared to prior AL methods. Finally, we implement a deep variant of our algorithm which shares some similarities to GAIL \cite{ho2016generative}, but where the discriminator is replaced with the costs learned by the OAL problem. Our simulations demonstrate our theoretically grounded approach outperforms the baselines.

algorithm, iteration, optimization, (16 more...)

arXiv.org Machine Learning

2102.06924

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Education > Educational Setting > Online (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

A Reduction from Apprenticeship Learning to Classification

Syed, Umar, Schapire, Robert E.

Neural Information Processing SystemsFeb-15-2020, 03:41:50 GMT

We provide new theoretical results for apprenticeship learning, a variant of reinforcement learning in which the true reward function is unknown, and the goal is to perform well relative to an observed expert. We study a common approach to learning from expert demonstrations: using a classification algorithm to learn to imitate the expert's behavior. Although this straightforward learning strategy is widely-used in practice, it has been subject to very little formal analysis. We prove that, if the learned classifier has error rate $\eps$, the difference between the value of the apprentice's policy and the expert's policy is $O(\sqrt{\eps})$. Further, we prove that this difference is only $O(\eps)$ when the expert's policy is close to optimal.

apprenticeship learning, classification, demonstration, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback

Exploring Apprenticeship Learning for Player Modelling in Interactive Narratives

Rivera-Villicana, Jessica, Zambetta, Fabio, Harland, James, Berry, Marsha

arXiv.org Artificial IntelligenceSep-16-2019

In this paper we present an early Apprenticeship Learning approach to mimic the behaviour of different players in a short adaption of the interactive fiction Anchorhead. Our motivation is the need to understand and simulate player behaviour to create systems to aid the design and person-alisation of Interactive Narratives (INs). INs are partially observable for the players and their goals are dynamic as a result. We used Receding Horizon IRL (RHIRL) to learn players' goals in the form of reward functions, and derive policies to imitate their behaviour. Our preliminary results suggest that RHIRL is able to learn action sequences to complete a game, and provided insights towards generating behaviour more similar to specific players.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3341215.3356314

1909.07268

Country:

North America > United States (0.28)
Oceania > Australia > Victoria > Melbourne (0.14)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback